Publish private extension images to serverless-testing account (093468662994)#1183

Draft
duncanista wants to merge 1 commit into main from jordan.gonzalez/publish-private-images-to-testing-account

@duncanista
Contributor

Overview

Update the "publish private images" CI job to push extension container images to the serverless-testing account (093468662994) instead of the sandbox account (425362996713).

The self-monitoring container runtimes (LOD, LMI) run in 093468662994. Co-locating the extension images there eliminates cross-account ECR pulls during CDK Docker builds.

Changes

  • build_private_image.sh: Target 093468662994.dkr.ecr.us-east-1.amazonaws.com/datadog-lambda-extension (configurable via PRIVATE_IMAGE_ECR_ACCOUNT / PRIVATE_IMAGE_ECR_REPO env vars). Layer version discovery queries the sandbox layer by its full ARN, so the lookup targets the sandbox account regardless of which account's role is assumed.
  • environments.yaml: Add serverless_testing environment entry for 093468662994
  • pipeline.yaml.tpl: "publish private images" job assumes serverless_testing role instead of sandbox
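
As a rough illustration of the env-var override described for build_private_image.sh, the target image reference could be assembled as below. This is a minimal sketch, not the script's actual code: the function name private_image_uri and the ECR_REGION variable are hypothetical, and the tag argument is a placeholder for whatever version the script computes.

```shell
# Sketch only: defaults mirror the values named in this PR. The function
# name private_image_uri and the ECR_REGION variable are hypothetical.
set -eu

# Account and repository, overridable from the environment.
PRIVATE_IMAGE_ECR_ACCOUNT="${PRIVATE_IMAGE_ECR_ACCOUNT:-093468662994}"
PRIVATE_IMAGE_ECR_REPO="${PRIVATE_IMAGE_ECR_REPO:-datadog-lambda-extension}"
ECR_REGION="us-east-1"

# Print the fully qualified image reference for the given tag.
private_image_uri() {
  tag="$1"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' \
    "$PRIVATE_IMAGE_ECR_ACCOUNT" "$ECR_REGION" "$PRIVATE_IMAGE_ECR_REPO" "$tag"
}
```

With nothing set in the environment, `private_image_uri 42` prints a reference under the 093468662994 registry. Exporting PRIVATE_IMAGE_ECR_ACCOUNT before the script runs redirects it to another account without editing the script.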

Prerequisites

  • ECR repo datadog-lambda-extension must exist in 093468662994 (created via serverless-self-monitoring#637)
  • IAM role lambda-extension-image-publisher must exist in 093468662994 with ECR push permissions
  • Vault secret serverless-testing-publish-externalid must be configured at kv/k8s/gitlab-runner/datadog-lambda-extension/secrets
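
The first two prerequisites can be spot-checked from a shell that holds credentials for 093468662994. The helpers below are a hedged sketch: the function names are hypothetical, and they confirm only that the repo and role exist, not that the role's policy allows ECR pushes. The Vault secret is not covered here.

```shell
# Sketch only: function names are hypothetical. Run with credentials for
# the target account (093468662994 in this PR).
set -eu

# Print the repository URI if the ECR repo exists; the call fails otherwise.
ecr_repo_uri() {
  account="$1"
  repo="$2"
  aws ecr describe-repositories \
    --region us-east-1 \
    --registry-id "$account" \
    --repository-names "$repo" \
    --query 'repositories[0].repositoryUri' \
    --output text
}

# Print the role ARN if the IAM role exists; the call fails otherwise.
iam_role_arn() {
  aws iam get-role --role-name "$1" --query 'Role.Arn' --output text
}

# Example invocations (need AWS credentials, so left commented out):
#   ecr_repo_uri 093468662994 datadog-lambda-extension
#   iam_role_arn lambda-extension-image-publisher
```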

Testing

  • Verify generated pipeline YAML has correct environment for "publish private images" jobs
  • Trigger manual "publish private images" job and confirm push to 093468662994 ECR
  • Verify LOD/LMI can pull from 093468662994 ECR during CDK deploy
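
For the second testing step, one way to confirm that a push landed is to ask ECR for the pushed tag. This is a sketch under assumptions: the function name image_pushed_at is hypothetical, and the tag to look up depends on the version the publish job computed.

```shell
# Sketch only: image_pushed_at is a hypothetical helper, not part of the PR.
set -eu

# Print the push timestamp of a tag; the call fails if the tag is absent.
image_pushed_at() {
  account="$1"
  repo="$2"
  tag="$3"
  aws ecr describe-images \
    --region us-east-1 \
    --registry-id "$account" \
    --repository-name "$repo" \
    --image-ids "imageTag=$tag" \
    --query 'imageDetails[0].imagePushedAt' \
    --output text
}

# Example invocation (needs AWS credentials and a real tag):
#   image_pushed_at 093468662994 datadog-lambda-extension "<tag>"
```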

Update build_private_image.sh to push to 093468662994
(serverless-testing) instead of 425362996713 (sandbox).

The self-monitoring container runtimes (LOD, LMI) run in
093468662994, so co-locating the extension images there
removes the need for cross-account ECR pulls during CDK
Docker builds.
Contributor

Copilot AI left a comment

Pull request overview

Updates the CI “publish private images” workflow to publish Lambda extension container images into the serverless-testing AWS account (093468662994) to avoid cross-account ECR pulls during downstream builds.

Changes:

  • Switch the “publish private images” GitLab job to assume the serverless_testing environment role.
  • Update build_private_image.sh to target 093468662994.dkr.ecr.us-east-1.amazonaws.com/datadog-lambda-extension by default (configurable via env vars) and to look up the next tag by querying the sandbox layer ARN.
  • Add a new serverless_testing entry to the environments datasource.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| .gitlab/templates/pipeline.yaml.tpl | Makes the private image publish job assume the serverless-testing environment instead of sandbox. |
| .gitlab/scripts/build_private_image.sh | Changes ECR destination defaults and updates layer-version lookup logic for image tagging. |
| .gitlab/datasources/environments.yaml | Adds the serverless_testing environment definition (account/role/external-id). |

serverless_testing:
  external_id: serverless-testing-publish-externalid
  role_to_assume: lambda-extension-image-publisher
  account: 093468662994

Copilot AI Apr 13, 2026

The new serverless_testing environment entry is included in the range $environment_name, $environment := (ds "environments").environments loop in .gitlab/templates/pipeline.yaml.tpl, which generates publish layer <env> jobs that expect add_layer_version_permissions and automatically_bump_version. Since those keys are missing here, the generated pipeline will either render invalid values or run publish-layer jobs with unintended defaults/behavior.

Consider either (a) adding the missing fields with explicit values for serverless_testing and ensuring the role can publish layers, or (b) changing the pipeline template/data model to exclude serverless_testing from the layer-publish environment loop (e.g., add a flag to environments and skip when false).

Suggested change
-account: 093468662994
+account: 093468662994
+add_layer_version_permissions: 0
+automatically_bump_version: 0

Comment on lines +33 to 34
latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
VERSION=$(($latest_version + 1))

Copilot AI Apr 13, 2026

This script now calls aws lambda list-layer-versions against a sandbox layer ARN (425362996713), but the CI job that runs it was updated to assume the serverless_testing role. Unless that role is granted lambda:ListLayerVersions on the sandbox layer resource, this call will fail with AccessDenied and the image publish job will exit (due to set -e).

Either ensure the lambda-extension-image-publisher role has cross-account permission to list versions for the sandbox layer ARN(s), or adjust the workflow so the layer version lookup is performed with sandbox credentials (e.g., assume the sandbox role just for this lookup).

Suggested change
-latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
-VERSION=$(($latest_version + 1))
+SANDBOX_LAYER_ARN="arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}"
+if [ -n "${SANDBOX_LAYER_LOOKUP_ROLE_ARN:-}" ]; then
+    printf "Assuming sandbox role for layer version lookup in account %s...\n" "$SANDBOX_ACCOUNT"
+    assume_role_output=$(aws sts assume-role \
+        --region us-east-1 \
+        --role-arn "$SANDBOX_LAYER_LOOKUP_ROLE_ARN" \
+        --role-session-name "build-private-image-layer-lookup" \
+        --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
+        --output text)
+    read -r sandbox_access_key_id sandbox_secret_access_key sandbox_session_token <<EOF
+$assume_role_output
+EOF
+    latest_version=$(
+        AWS_ACCESS_KEY_ID="$sandbox_access_key_id" \
+        AWS_SECRET_ACCESS_KEY="$sandbox_secret_access_key" \
+        AWS_SESSION_TOKEN="$sandbox_session_token" \
+        aws lambda list-layer-versions \
+            --region us-east-1 \
+            --layer-name "$SANDBOX_LAYER_ARN" \
+            --query 'LayerVersions[0].Version || `0`' \
+            --output text
+    )
+else
+    latest_version=$(aws lambda list-layer-versions \
+        --region us-east-1 \
+        --layer-name "$SANDBOX_LAYER_ARN" \
+        --query 'LayerVersions[0].Version || `0`' \
+        --output text)
+fi
+if ! [[ "$latest_version" =~ ^[0-9]+$ ]]; then
+    printf "Failed to resolve a numeric sandbox layer version for %s. Configure SANDBOX_LAYER_LOOKUP_ROLE_ARN with a role that can call lambda:ListLayerVersions on the sandbox layer, or grant the current role that permission.\n" "$SANDBOX_LAYER_ARN" >&2
+    exit 1
+fi
+VERSION=$((latest_version + 1))

@duncanista marked this pull request as draft April 14, 2026 16:36

3 participants